# Low-resource inference optimization

Kodify Nano GGUF

Kodify-Nano-GGUF is the GGUF version of the Kodify-Nano model, optimized for CPU/GPU inference. It is a lightweight large language model suitable for code development tasks.

Large Language Model

Qwen3 30B A1.5B 64K High Speed NEO Imatrix MAX Gguf

An optimized version of the Qwen3-30B-A3B Mixture of Experts model that gains speed by reducing the number of active experts. It supports a 64k context length and is suitable for a range of text generation tasks.

Large Language Model Supports Multiple Languages

Qwen3 128k 30B A3B NEO MAX Imatrix Gguf

A GGUF quantized version of the Qwen3-30B-A3B Mixture of Experts model, extended to a 128k context length and optimized with NEO Imatrix quantization. It supports multilingual and multitask processing.

Large Language Model Supports Multiple Languages

Llama 4 Scout 17B 16E Instruct Bnb 4bit

A 4-bit quantized version of meta-llama/Llama-4-Scout-17B-16E-Instruct, produced with bitsandbytes (bnb) and suitable for multilingual tasks.

Large Language Model

Transformers Supports Multiple Languages

Llama 3.2 11B Vision Instruct GGUF

Llama-3.2-11B-Vision-Instruct is a multilingual vision-language model for image-text-to-text tasks, that is, generating text from an image plus a text prompt. This entry is its GGUF version.

Transformers Supports Multiple Languages

Nvidia Llama 3.1 Nemotron 70B Instruct HF AWQ INT4

An AWQ 4-bit (INT4) quantized version of NVIDIA's Llama-3.1-Nemotron-70B-Instruct, a model NVIDIA customized from Meta's Llama-3.1-70B-Instruct to improve the helpfulness of generated responses.

Large Language Model

Transformers Supports Multiple Languages
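
To see why the 4-bit entries above matter for low-resource inference, a rough estimate of weight memory helps. The sketch below is a back-of-the-envelope calculation, not a measurement: the function name `weight_memory_gib` is hypothetical, the parameter count is taken from the model name (70B), and the estimate ignores the KV cache, activations, and the per-group scale and zero-point overhead that real quantization formats add.

```python
def weight_memory_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory needed for model weights alone, in GiB.

    Assumption: every parameter is stored at the same bit width and
    no quantization metadata (scales, zero points) is counted.
    """
    total_bytes = n_params * bits_per_weight / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    # Compare a 70B-parameter model at common bit widths.
    for bits in (16, 8, 4):
        gib = weight_memory_gib(70e9, bits)
        print(f"70B parameters at {bits}-bit: about {gib:.1f} GiB of weights")
```

Under these assumptions, going from 16-bit to 4-bit weights cuts weight memory by a factor of four, which is the main reason quantized builds fit on smaller GPUs or in CPU memory.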

Kunoichi DPO V2 7B GGUF Imatrix

A 7B-parameter large language model based on the Mistral architecture and trained with DPO (Direct Preference Optimization), reported to perform well on multiple benchmarks.

Large Language Model

Speechless Coder Ds 6.7b

speechless-coder-ds-6.7b is a large language model fine-tuned from deepseek-ai/deepseek-coder-6.7b to improve code generation and programming assistance.

Large Language Model

Transformers Supports Multiple Languages

GenZ is a large language model fine-tuned from Meta's open-source Llama 2 70B model, intended to provide high-performance text generation for the open-source community.

Large Language Model

Transformers English

© 2025 AIbase